Introduction

When comparing networks with the same number of nodes by direct methods, a number of
distances is already available in the literature. Two of the most common families are the
edit-like distances and the spectral distances. The functions in the former family quantify
the difference between two networks as the minimum number of edit operations (possibly with
different costs) transforming one network into the other, that is, deletions and insertions
of links, while spectral measures rely on functions of the eigenvalues of one of the
connectivity matrices of the underlying graph.

A noticeable issue affecting edit distances is locality: they do not take into account the
global structure of the networks, but only sum the contributions coming from each single
link. On the other hand, spectral measures cannot distinguish isomorphic or isospectral
graphs. We propose here a possible solution to overcome both issues: combining an edit
distance (the normalized Hamming) and a spectral distance (the normalized Ipsen-Mikhailov)
in a product metric that we denote by HIM. The proposed solution can be applied to any pair
of simple networks, i.e., undirected or directed graphs and weighted or unweighted nets,
provided the weights lie in the unit interval [0,1]. In what follows we define the two
components and the HIM metric itself, with a few examples of applications.

Notations 

Hereafter $\mathcal{N}_1$ and $\mathcal{N}_2$ are two simple networks on $N$ nodes, described
by the corresponding adjacency matrices $A^{(1)}$ and $A^{(2)}$, with
$a^{(1)}_{ij}, a^{(2)}_{ij} \in \mathcal{F}$, where $\mathcal{F} = \mathbb{F}_2 = \{0,1\}$ for
unweighted graphs and $\mathcal{F} = [0,1]$ for weighted networks. We will moreover denote by
$\mathbb{I}_N$ the identity $N \times N$ matrix

$$ \mathbb{I}_N = \begin{pmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{pmatrix} , $$

by $\mathbb{1}_N$ the unitary $N \times N$ matrix with all entries equal to one and by
$\mathbb{0}_N$ the null $N \times N$ matrix with all entries equal to zero. Moreover, we
denote by $\mathcal{E}_N$ the empty network with $N$ nodes and no links (with adjacency
matrix $\mathbb{0}_N$) and by $\mathcal{F}_N$ the undirected full network with $N$ nodes and
all possible $N(N-1)$ links (whose adjacency matrix is $\mathbb{1}_N - \mathbb{I}_N$).

For a directed network $\mathcal{N}^{\rightarrow}$, following the convention in [T], a link
$i \rightarrow j$ is represented by setting $a_{ji} = 1$ in the corresponding adjacency matrix
$A^{\rightarrow}$, which thus is, in general, not symmetric. For instance, the matrix
$\mathbb{1}_N - \mathbb{I}_N$ represents the full directed network
$\mathcal{F}^{\rightarrow}_N$, with all possible $N^2 - N$ directed links $i \rightarrow j$.
The example for $N = 4$ is shown in Fig. 1.

Hamming distance 

Since computing the (plain) edit distance is an NP-hard task, we choose the simplest member
of the edit distance family, the Hamming distance, which evaluates the presence/absence of
matching links in the two networks being compared.

The Hamming distance is one of the most common dissimilarity measures in coding and string
theory, and recently it has been used for (biological) network comparison in [3J [3].

The definition of the Hamming distance is the following:

$$ \mathrm{Hamming}(\mathcal{N}_1,\mathcal{N}_2) = \sum_{1 \le i \ne j \le N} \left| A^{(1)}_{ij} - A^{(2)}_{ij} \right| . $$

To guarantee independence from the network dimension (number of nodes), we normalize
the above function by the factor $\eta = \mathrm{Hamming}(\mathcal{E}_N,\mathcal{F}_N) = N(N-1)$:

$$ H(\mathcal{N}_1,\mathcal{N}_2) = \frac{1}{N(N-1)} \sum_{1 \le i \ne j \le N} \left| A^{(1)}_{ij} - A^{(2)}_{ij} \right| . \tag{1} $$

Figure 1: The full directed network $\mathcal{F}^{\rightarrow}_4$.

When $\mathcal{N}_1$ and $\mathcal{N}_2$ are unweighted networks, $H(\mathcal{N}_1,\mathcal{N}_2)$
is just the fraction of different matching links (over the total number $N(N-1)$ of possible
links) between the two graphs. In all cases, $H(\mathcal{N}_1,\mathcal{N}_2) \in [0,1]$, where
the lower bound 0 is attained only for identical networks $A^{(1)} = A^{(2)}$ and the upper
bound 1 is reached whenever the two networks are complementary:

$$ A^{(1)} + A^{(2)} = \mathbb{1}_N - \mathbb{I}_N = \begin{pmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{pmatrix} . $$
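As a minimal sketch of the normalized Hamming distance defined above, the following Python function sums the absolute entrywise differences over all ordered pairs of distinct nodes and divides by N(N-1). The function name `hamming` and the list-of-lists representation of adjacency matrices are illustrative choices, not part of the original text.

```python
def hamming(a1, a2):
    """Normalized Hamming distance between two N x N adjacency matrices.

    a1, a2: lists of lists with entries in [0, 1] (hypothetical
    representation chosen for this sketch). The diagonal is ignored,
    and every ordered pair (i, j) with i != j is counted.
    """
    n = len(a1)
    total = sum(
        abs(a1[i][j] - a2[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return total / (n * (n - 1))


# Empty and full undirected networks on 4 nodes are complementary.
empty = [[0] * 4 for _ in range(4)]
full = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
```

With these two matrices the distance reaches its upper bound, while any network compared with itself gives zero; a single undirected link contributes twice to the sum because both (i, j) and (j, i) are counted.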



Ipsen-Mikhailov distance 

Originally introduced in [J] as a tool for network reconstruction from its Laplacian spectrum,
the definition of the Ipsen-Mikhailov metric $\epsilon$ follows the dynamical interpretation
of an $N$-node network as a molecule of $N$ atoms connected by identical elastic strings as
in Fig. 2, where the pattern of connections is defined by the adjacency matrix of the
corresponding network. The dynamical system is described by the set of $N$ differential
equations

$$ \ddot{x}_i + \sum_{j=1}^{N} A_{ij}\,(x_i - x_j) = 0 \quad \text{for } i = 0, \dots, N-1 . \tag{2} $$

We recall that the Laplacian matrix $L$ of an undirected network is defined as the difference
between the degree matrix $D$ and the adjacency matrix $A$, $L = D - A$, where $D$ is the
diagonal matrix with vertex degrees as entries. $L$ is positive semidefinite and singular
[SJ [71 [H [S], so its eigenvalues are $0 = \lambda_0 \le \lambda_1 \le \cdots \le \lambda_{N-1}$.
The vibrational frequencies $\omega_i$ for the network model in Eq. (2) are given by the
eigenvalues of the Laplacian matrix of the network: $\lambda_i = \omega_i^2$, with
$\lambda_0 = \omega_0 = 0$. In [5], the Laplacian spectrum is called the vibrational
spectrum. Estimates (also asymptotic) of the eigenvalue distribution are available for
complex networks [5]. Moreover, the relations between the spectral properties and the
structure and the dynamics of a network are discussed in [TU1 [TTJ [T^].
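The two stated properties of the Laplacian can be checked directly on a small case. The sketch below (helper names `laplacian` and `quadratic_form` are hypothetical) builds L = D - A for an unweighted undirected network and evaluates the quadratic form x^T L x, which equals the sum of (x_i - x_j)^2 over the links and is therefore never negative; the zero row sums show that the all-ones vector lies in the kernel, i.e. L is singular.

```python
def laplacian(a):
    """Laplacian L = D - A of a simple undirected network.

    a: symmetric adjacency matrix as a list of lists with zero diagonal.
    """
    n = len(a)
    return [
        [(sum(a[i]) if i == j else 0) - a[i][j] for j in range(n)]
        for i in range(n)
    ]


def quadratic_form(l, x):
    """Value of x^T L x for a vector x given as a list."""
    n = len(l)
    return sum(x[i] * l[i][j] * x[j] for i in range(n) for j in range(n))


# Path graph on three nodes: 0 - 1 - 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
l_path = laplacian(path)
```

For the path graph and x = (1, 2, 4) the quadratic form is (1-2)^2 + (2-4)^2 = 5, one term per link.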




Figure 2: A three-node network as an oscillatory system.

The spectral density for a graph is defined as the sum of Lorentz distributions

$$ \rho(\omega,\gamma) = K \sum_{i=1}^{N-1} \frac{\gamma}{(\omega-\omega_i)^2 + \gamma^2} , \tag{3} $$

where $\gamma$ is the common width and $K$ is the normalization constant defined as

$$ K = \frac{1}{\displaystyle \int_0^{\infty} \sum_{i=1}^{N-1} \frac{\gamma}{(\omega-\omega_i)^2 + \gamma^2}\, d\omega} , \tag{4} $$

so that $\int_0^{\infty} \rho(\omega,\gamma)\, d\omega = 1$. The scale parameter $\gamma$ specifies
the half-width at half-maximum, which is equal to half the interquartile range.

In Fig. 7 we show the plot of the Lorentz distribution for two networks.

The spectral distance $\epsilon_\gamma$ between two graphs $G$ and $H$ on $N$ nodes with
densities $\rho_G(\omega,\gamma)$ and $\rho_H(\omega,\gamma)$ can then be defined as

$$ \epsilon_\gamma(G,H) = \sqrt{ \int_0^{\infty} \left[ \rho_G(\omega,\gamma) - \rho_H(\omega,\gamma) \right]^2 d\omega } . \tag{5} $$

The highest value of $\epsilon_\gamma$ is reached, for each $N$, when evaluating the distance
between $\mathcal{E}_N$ and $\mathcal{F}_N$. Defining $\bar{\gamma}$ as the (unique) solution of

$$ \epsilon_{\bar{\gamma}}(\mathcal{E}_N,\mathcal{F}_N) = 1 , \tag{6} $$

we can now define the Ipsen-Mikhailov distance as

$$ \epsilon(G,H) = \epsilon_{\bar{\gamma}}(G,H) = \sqrt{ \int_0^{\infty} \left[ \rho_G(\omega,\bar{\gamma}) - \rho_H(\omega,\bar{\gamma}) \right]^2 d\omega } , \tag{7} $$

so that $\epsilon(G,H) \in [0,1]$ with upper bound attained only for
$(G,H) = (\mathcal{E}_N,\mathcal{F}_N)$.

The uniqueness of the solution of Eq. (6) is discussed in detail in Appendix A. Clearly,
isospectral networks (and thus also isomorphic networks) cannot be distinguished by this
class of measures, so this is a distance between classes of isospectral graphs: although
the number of isospectral networks is negligible for a large number of nodes [13], their
fraction is relevant for smaller networks.
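The integral in the definition of the spectral distance can be evaluated without numerical quadrature, because the integral over [0, +inf) of a product of two Lorentzians with the same width has an elementary antiderivative. The sketch below assumes that closed form (obtained here by partial fractions; it is not quoted from the text), takes the two Laplacian spectra as plain Python lists, and drops the smallest eigenvalue of each, following the sum over i = 1, ..., N-1 in the density. All function names are hypothetical.

```python
import math


def pair_integral(a, b, g):
    """Integral over [0, inf) of 1 / ((g^2+(w-a)^2) * (g^2+(w-b)^2)) dw."""
    if abs(a - b) < 1e-12:
        return (math.pi / 2 + g * a / (g * g + a * a) + math.atan(a / g)) / (2 * g ** 3)
    d = b - a
    s = d * d + 4 * g * g
    log_term = math.log((g * g + b * b) / (g * g + a * a)) / (d * s)
    atan_term = (math.pi + math.atan(a / g) + math.atan(b / g)) / (g * s)
    return log_term + atan_term


def lorentz_terms(eigenvalues, g):
    """(frequency, coefficient) pairs of the normalized spectral density."""
    # Drop lambda_0 = 0 and convert eigenvalues to frequencies omega_i.
    omegas = [math.sqrt(max(lam, 0.0)) for lam in sorted(eigenvalues)[1:]]
    # Each Lorentzian integrates to pi/2 + atan(omega_i / g) on [0, inf).
    norm = sum(math.pi / 2 + math.atan(w / g) for w in omegas)
    return [(w, g / norm) for w in omegas]


def ipsen_mikhailov(spec_g, spec_h, g):
    """Spectral distance between two Laplacian spectra for a given width g."""
    terms = lorentz_terms(spec_g, g)
    terms += [(w, -c) for w, c in lorentz_terms(spec_h, g)]
    total = sum(
        ci * cj * pair_integral(wi, wj, g)
        for wi, ci in terms
        for wj, cj in terms
    )
    return math.sqrt(max(total, 0.0))


# Empty and full networks on N = 10 nodes, width taken from Table 1.
spec_empty = [0.0] * 10
spec_full = [0.0] + [10.0] * 9
```

Evaluating the distance between the empty and the full network on 10 nodes with the width 0.4517012 listed for N = 10 gives a value very close to 1, consistent with the normalization condition; identical spectra give zero.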



The HIM distance 

Consider now two copies of the space $\mathcal{N}(N)$ of all simple undirected networks on
$N$ nodes, and endow the first copy with the Hamming metric $H$ and the second copy with the
Ipsen-Mikhailov distance $\epsilon$. Then the two obtained pairs $(\mathcal{N}(N),H)$ and
$(\mathcal{N}(N),\epsilon)$ are metric spaces. Define now on their Cartesian product the L2
(Euclidean) product metric [14], normalized by the factor $\frac{1}{\sqrt{2}}$ to set its
upper bound to 1. We call this function the HIM metric on the product space, which, with the
natural correspondence of the same network in the two spaces, becomes a distance on
$\mathcal{N}(N)$:

$$ \mathrm{HIM}(\mathcal{N}_1,\mathcal{N}_2) = \frac{1}{\sqrt{2}} \left\| \left( H(\mathcal{N}_1,\mathcal{N}_2), \epsilon(\mathcal{N}_1,\mathcal{N}_2) \right) \right\|_2 = \frac{1}{\sqrt{2}} \sqrt{ H^2(\mathcal{N}_1,\mathcal{N}_2) + \epsilon^2(\mathcal{N}_1,\mathcal{N}_2) } . \tag{8} $$

Figure 3: The [0, 1] x [0, 1] Hamming/Ipsen-Mikhailov space.

The metric $\mathrm{HIM}(\mathcal{N}_1,\mathcal{N}_2)$ is bounded in the interval [0, 1], with
lower bound attained for every pair of identical networks, and upper bound attained only on
the pair $(\mathcal{E}_N,\mathcal{F}_N)$.

Because of the different nature of the two components of the product metric, HIM will
be nonzero for non-identical isomorphic/isospectral graphs.

Consider now the [0, 1] x [0, 1] Hamming/Ipsen-Mikhailov space: a point $P(x,y)$ represents
the distance between two networks $G_1, G_2$, whose coordinates are $x = H(G_1,G_2)$ and
$y = \epsilon(G_1,G_2)$, and the norm of $P$ is $\sqrt{2}$ times the distance
$\mathrm{HIM}(G_1,G_2)$. If we (roughly) split the Hamming/Ipsen-Mikhailov space into four
main zones I, II, III, IV as in Fig. 3, we can say that two networks whose distances
correspond to a point in zone I are quite close both in terms of matching links and of
structure, while those falling in zone III are very different with respect to both
characteristics. Networks corresponding to a point in zone II have many common links, but
their structure is rather different, while a point in zone IV indicates two networks with
few common links, but with similar structure.

Examples of pairs of networks living in different zones of the Hamming/Ipsen-Mikhailov
space are shown in the sections below.
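Once the two components are available, combining them as in Eq. (8) is a one-line computation. The sketch below (function name `him` is a hypothetical choice) takes the normalized Hamming and Ipsen-Mikhailov values as plain floats.

```python
import math


def him(h, im):
    """HIM distance from its Hamming (h) and Ipsen-Mikhailov (im) components.

    Both inputs are assumed to lie in [0, 1]; the Euclidean norm of the
    point (h, im) is rescaled by 1/sqrt(2) so that the result is in [0, 1].
    """
    return math.hypot(h, im) / math.sqrt(2)


# Components of the minimal example discussed in this paper.
example = him(0.5, 0.1004144)
```

With H = 0.5 and Ipsen-Mikhailov distance 0.1004144 this reproduces the value 0.3606127 reported for the minimal example, and the corner point (1, 1) of the Hamming/Ipsen-Mikhailov space maps to 1.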



HIM for directed networks 

In the case of a pair of directed networks, the corresponding adjacency matrices are not
symmetric, and the same happens for the Laplacian matrices. Therefore, their Laplacian
spectra lie in $\mathbb{C}$ rather than $\mathbb{R}$, and thus computing the Ipsen-Mikhailov
distance would require extending the Lorentzian distribution to the complex plane. A simpler
solution can be obtained by transforming the directed network $\mathcal{N}^{\rightarrow}$
into an undirected (bipartite) one $\widehat{\mathcal{N}}$, following the procedure cited in
pQ: for each node $x_i$ in $\mathcal{N}^{\rightarrow}$, the graph $\widehat{\mathcal{N}}$ has
two nodes $x_i^I$ and $x_i^O$ (where $I$ and $O$ stand for In and Out respectively) and for
each directed link $x_i \rightarrow x_j$ in $\mathcal{N}^{\rightarrow}$ there is an undirected
link between $x_i^O$ and $x_j^I$ in $\widehat{\mathcal{N}}$. If the adjacency matrix for
$\mathcal{N}^{\rightarrow}$ is $A^{\rightarrow}$, the corresponding matrix for
$\widehat{\mathcal{N}}$ is

$$ \widehat{A} = \begin{pmatrix} 0 & (A^{\rightarrow})^T \\ A^{\rightarrow} & 0 \end{pmatrix} $$

w.r.t. the node ordering $x_1^O, x_2^O, \dots, x_N^O, x_1^I, \dots, x_N^I$.
An example of the above transformation is shown in Fig. 4.

Figure 4: A directed network on three nodes and the equivalent undirected network
on six nodes, together with their adjacency matrices.

Thus it is possible to define $\mathrm{HIM}(\mathcal{N}_1^{\rightarrow},\mathcal{N}_2^{\rightarrow})$
as $\mathrm{HIM}(\widehat{\mathcal{N}}_1,\widehat{\mathcal{N}}_2)$ after substituting the
normalizing factors $\eta$ and $\bar{\gamma}$ with the corresponding $\eta^{\rightarrow}$ and
$\bar{\gamma}^{\rightarrow}$ derived by imposing the conditions
$\mathrm{Hamming}(\widehat{\mathcal{E}}_N,\widehat{\mathcal{F}}_N)/\eta^{\rightarrow} = 1$ and
$\epsilon_{\bar{\gamma}^{\rightarrow}}(\widehat{\mathcal{E}}_N,\widehat{\mathcal{F}}_N) = 1$,
so that $\mathrm{HIM}(\mathcal{E}^{\rightarrow}_N,\mathcal{F}^{\rightarrow}_N) = 1$ by using
Eq. (8). It is immediate to compute $\eta^{\rightarrow} = 2N(N-1)$, while
$\bar{\gamma}^{\rightarrow}$ can be numerically computed as for $\bar{\gamma}$: details are
given in Appendix B.
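The transformation from a directed network to its undirected bipartite counterpart can be sketched as follows. The code assumes the convention stated in the Notations section (entry [j][i] equal to 1 encodes the link i -> j) and the node ordering with all Out copies first and all In copies afterwards; the function name `bipartite_adjacency` is hypothetical.

```python
def bipartite_adjacency(a_dir):
    """2N x 2N symmetric adjacency matrix of the bipartite representation.

    a_dir: N x N adjacency matrix of a directed network, where
    a_dir[j][i] = 1 encodes the directed link i -> j.
    Rows/columns 0..N-1 are the Out copies, N..2N-1 the In copies.
    """
    n = len(a_dir)
    a_hat = [[0] * (2 * n) for _ in range(2 * n)]
    for j in range(n):
        for i in range(n):
            w = a_dir[j][i]
            a_hat[i][n + j] = w  # Out copy of i linked to In copy of j
            a_hat[n + j][i] = w  # symmetric entry
    return a_hat


# Directed 3-cycle 0 -> 1 -> 2 -> 0 and the full directed network on 3 nodes.
cycle = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
full_directed = [[0 if i == j else 1 for j in range(3)] for i in range(3)]
cycle_hat = bipartite_adjacency(cycle)
full_hat = bipartite_adjacency(full_directed)
```

For the full directed network the entries of the resulting matrix sum to 2N(N-1), which is the unnormalized Hamming distance from the empty network and hence the normalizing factor quoted above (12 for N = 3).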

Examples of applications 
A minimal example 

Consider the two networks $I_1, I_2 \in \mathcal{N}(8)$ with corresponding adjacency matrices
$A^{I_1}, A^{I_2}$ shown in Figs. 5 and 6.

Figure 5: Adjacency matrix and graphical representation of $I_1$.

Figure 6: Adjacency matrix and graphical representation of $I_2$.



The Hamming distance between $I_1$ and $I_2$ is

$$ H(I_1,I_2) = \frac{1}{N(N-1)} \sum_{1 \le i \ne j \le 8} \left| A^{I_1}_{ij} - A^{I_2}_{ij} \right| = \frac{1}{56} \cdot 28 = \frac{28}{56} = 0.5 . $$



From the spectral point of view, the corresponding Laplacian matrices have eigenvalues

$$ \mathrm{spec}(L^{I_1}) = (0,\ 0.657077,\ 1,\ 2.529317,\ 3,\ 4,\ 4,\ 4.813607) , $$

$$ \mathrm{spec}(L^{I_2}) = (0,\ 0,\ 0.340321,\ 1.145088,\ 3,\ 3,\ 3.854912,\ 4.659679) . $$

From the above spectra, we can compute the corresponding Lorentz distributions
$\rho_{I_1}(\omega,\bar{\gamma})$ and $\rho_{I_2}(\omega,\bar{\gamma})$, where
$\bar{\gamma} = 0.4450034$: their plots are shown in Fig. 7.

Figure 8: $\mathrm{HIM}(I_1,I_2)$ in the Hamming/Ipsen-Mikhailov space.

The resulting Ipsen-Mikhailov distance is

$$ \epsilon(I_1,I_2) = \sqrt{ \int_0^{\infty} \left[ \rho_{I_1}(\omega,\bar{\gamma}) - \rho_{I_2}(\omega,\bar{\gamma}) \right]^2 d\omega } = 0.1004144 , $$

so that the HIM distance is

$$ \mathrm{HIM}(I_1,I_2) = \frac{1}{\sqrt{2}} \left\| \left( H(I_1,I_2), \epsilon(I_1,I_2) \right) \right\|_2 \approx 0.7071068 \sqrt{0.5^2 + 0.1004144^2} \approx 0.3606127 . $$

The situation can be graphically represented as in Fig. 8: the two networks are quite
different in terms of matching links, but their structures are not so far apart.



A larger study 

For a fixed number of nodes $N$, there are exactly $2^{\frac{N(N-1)}{2}}$ different simple
undirected unweighted networks, which can be grouped into isomorphism classes. As anticipated
before, isomorphic graphs cannot be distinguished by spectral metrics, while their mutual
Hamming distances are nonzero, since their links are in different positions. As an example,
for $N = 3$ there are 8 networks grouped in 4 isomorphism classes, for $N = 4$ there are 11
isomorphism classes including a total of 64 graphs and for $N = 5$ there are 34 classes with
1024 networks (for $N = 6, 7$, the number of classes is respectively 156 and 1044).
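The counts for small N can be reproduced by brute force. The sketch below enumerates every labelled simple undirected graph on n nodes and groups them by a canonical form, taken here as the lexicographically smallest sorted edge list over all node relabelings. This is only an illustrative approach that is practical for very small n; the function name `isomorphism_classes` is hypothetical.

```python
from itertools import combinations, permutations


def isomorphism_classes(n):
    """Return (number of labelled graphs, number of isomorphism classes)."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    seen = set()
    for mask in range(2 ** len(pairs)):
        # Decode the bit mask into the list of links of one labelled graph.
        edges = [pairs[k] for k in range(len(pairs)) if (mask >> k) & 1]
        # Canonical form: smallest relabelled edge list over all permutations.
        canon = min(
            tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
            for p in perms
        )
        seen.add(canon)
    return 2 ** len(pairs), len(seen)


counts_3 = isomorphism_classes(3)
counts_4 = isomorphism_classes(4)
```

For n = 3 and n = 4 this yields 8 graphs in 4 classes and 64 graphs in 11 classes, in agreement with the figures quoted above.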

To give an overview of a broader situation, we compute a number of mutual distances
between networks with a given number of nodes (all possible pairs for $N = 3, 4, 5$ and a
subset of them for $N = 15$) and we display the results in Fig. 9. To select a good range of
variability for the networks with 15 nodes, we select the empty graph, the full graph (with
105 edges) and 10 different graphs with $i$ edges each, for $1 \le i \le 104$.

As shown by the plots, all possible situations can occur, with points in the northwest
corner of zone II being the rarest. For instance, the point $P(1, 0)$ in Fig. 9(b)
corresponds to 6 different pairs $(O_1, O_2)$ of networks with 4 nodes with maximal Hamming
distance and minimal spectral distance: as an example, we show one of these pairs in Fig. 10.
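The count of 6 pairs for 4 nodes can be cross-checked with a short enumeration. Maximal Hamming distance means that the two networks are complementary, and a network that is isomorphic to its own complement is in particular isospectral to it, so its spectral distance from the complement is zero. The sketch below counts such unordered pairs by brute force; it tests isomorphism rather than isospectrality, which is an assumption of this example, and the function name is hypothetical.

```python
from itertools import combinations, permutations


def self_complementary_pairs(n):
    """Count unordered pairs {G, complement of G} with G isomorphic to its complement."""
    pairs = list(combinations(range(n), 2))
    all_edges = frozenset(pairs)
    perms = list(permutations(range(n)))
    count = 0
    for k in range(len(pairs) + 1):
        for edges in combinations(pairs, k):
            g = frozenset(edges)
            comp = all_edges - g
            # Is some relabelling of G exactly equal to its complement?
            if any(
                frozenset(tuple(sorted((p[u], p[v]))) for u, v in g) == comp
                for p in perms
            ):
                count += 1
    # Each unordered pair is found twice, once from G and once from its complement.
    return count // 2


pairs_4 = self_complementary_pairs(4)
```

For n = 4 the only such networks are the 12 labelled paths on four nodes, which form 6 complementary pairs.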

Biological networks 

In [15], the authors used the Keller algorithm to infer the gene regulatory networks of
Drosophila melanogaster from a time series of gene expression data measured during its full
life cycle. They selected 66 time points during the developmental cycle, spanning four
different stages (Embryonic: time points 1-30, Larval: t.p. 31-40, Pupal: t.p. 41-58, Adult:
t.p. 59-66), following the dynamics of 588 gene ontological groups and then constructing a
time series of inferred networks $N_i$. Hereafter we evaluate the structural differences
between $N_i$ and the initial network $N_1$, as measured by the HIM distance: the resulting
plot is displayed in Fig. 11. The largest variations, both between consecutive terms and with
respect to the initial network $N_1$, occur in the embryonic stage (E). In particular, it is
interesting to note that the dynamics of the networks move $N_i$ away from $N_1$ until time
point 23, then the following terms start getting closer again to $N_1$ in terms of HIM
distance: such behaviour was detected also in the original paper, but only qualitatively,
while the introduced metrics can provide a quantitative assessment of the occurring
differences.

Finally, note the different ranges of the two distances: while the Hamming distance ranges
between 0 and 0.0223, the Ipsen-Mikhailov distance has 0.0851 as its maximum, indicating a
higher variability of the networks in terms of structure rather than matching links.

Figure 10: Pair of networks with Hamming distance one and Ipsen-Mikhailov distance zero.

A Uniqueness of $\bar{\gamma}$

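Fix the number $N$ of nodes, and consider the two extremal networks $\mathcal{E}_N$ and
$\mathcal{F}_N$, whose Laplacian spectra are respectively

$$ \mathrm{spec}(L(\mathcal{E}_N)) = (\underbrace{0, \dots, 0}_{N}) \quad \text{and} \quad \mathrm{spec}(L(\mathcal{F}_N)) = (0, \underbrace{N, \dots, N}_{N-1}) , $$

so that $\omega_i = 0$ for the empty network and $\omega_i = \sqrt{N}$ for the fully
connected network, for $i = 1, \dots, N-1$.

The Lorentz distribution for the empty network is thus

$$ \rho_{\mathcal{E}_N}(\omega,\gamma) = K \sum_{i=1}^{N-1} \frac{\gamma}{\gamma^2 + \omega^2} = \frac{K\gamma(N-1)}{\gamma^2 + \omega^2} , $$

where $K$ can be computed as

$$ K = \frac{1}{\displaystyle \int_0^{+\infty} \frac{\gamma(N-1)}{\gamma^2 + \omega^2}\, d\omega} = \frac{1}{(N-1) \left[ \arctan\frac{\omega}{\gamma} \right]_0^{+\infty}} = \frac{2}{(N-1)\pi} , $$

so that

$$ \rho_{\mathcal{E}_N}(\omega,\gamma) = \frac{K\gamma(N-1)}{\gamma^2 + \omega^2} = \frac{2\gamma}{\pi(\gamma^2 + \omega^2)} . $$

For the fully connected network we have

$$ \rho_{\mathcal{F}_N}(\omega,\gamma) = K \sum_{i=1}^{N-1} \frac{\gamma}{\gamma^2 + (\omega - \sqrt{N})^2} = \frac{\gamma K (N-1)}{\gamma^2 + \omega^2 + N - 2\omega\sqrt{N}} , $$

where $K$ is

$$ K = \frac{1}{\displaystyle \gamma(N-1) \int_0^{+\infty} \frac{d\omega}{\gamma^2 + \omega^2 + N - 2\omega\sqrt{N}}} = \frac{1}{(N-1) \left[ \arctan\frac{\omega - \sqrt{N}}{\gamma} \right]_0^{+\infty}} = \frac{1}{(N-1) \left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)} , $$

so that

$$ \rho_{\mathcal{F}_N}(\omega,\gamma) = \frac{\gamma K (N-1)}{\gamma^2 + \omega^2 + N - 2\omega\sqrt{N}} = \frac{\gamma}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right) \left( \gamma^2 + \omega^2 + N - 2\omega\sqrt{N} \right)} . $$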
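Thus, we expand Eq. (6) as follows. Since $\epsilon_\gamma(\mathcal{E}_N,\mathcal{F}_N) = 1$
is equivalent to $\epsilon_\gamma^2(\mathcal{E}_N,\mathcal{F}_N) = 1$, we have

$$ 1 = \epsilon_\gamma^2(\mathcal{E}_N,\mathcal{F}_N) = \int_0^{+\infty} \left[ \frac{2\gamma}{\pi(\gamma^2 + \omega^2)} - \frac{\gamma}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right) \left( \gamma^2 + \omega^2 + N - 2\omega\sqrt{N} \right)} \right]^2 d\omega = \int_0^{+\infty} A^2\, d\omega + \int_0^{+\infty} B^2\, d\omega - 2 \int_0^{+\infty} AB\, d\omega , \tag{11} $$

where

$$ A = \frac{2\gamma}{\pi(\gamma^2 + \omega^2)} , \qquad B = \frac{\gamma}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right) \left( \gamma^2 + \omega^2 + N - 2\omega\sqrt{N} \right)} . $$

The three terms in Eq. (11) can be expanded as follows:

$$ \int_0^{+\infty} A^2\, d\omega = \frac{4\gamma^2}{\pi^2} \int_0^{+\infty} \frac{d\omega}{(\gamma^2 + \omega^2)^2} = \frac{4\gamma^2}{\pi^2} \frac{1}{2\gamma^3} \left[ \frac{\gamma\omega}{\gamma^2 + \omega^2} + \arctan\frac{\omega}{\gamma} \right]_0^{+\infty} = \frac{2}{\gamma\pi^2} \cdot \frac{\pi}{2} = \frac{1}{\pi\gamma} , \tag{12} $$

$$ \int_0^{+\infty} B^2\, d\omega = \frac{\gamma^2}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)^2} \int_0^{+\infty} \frac{d\omega}{\left( \gamma^2 + \omega^2 + N - 2\omega\sqrt{N} \right)^2} = \frac{\gamma^2}{2\gamma^3 \left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)^2} \left[ \frac{\gamma(\omega - \sqrt{N})}{\gamma^2 + (\omega - \sqrt{N})^2} + \arctan\frac{\omega - \sqrt{N}}{\gamma} \right]_0^{+\infty} = \frac{1}{2\gamma \left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)^2} \left( \frac{\pi}{2} + \frac{\gamma\sqrt{N}}{\gamma^2 + N} + \arctan\frac{\sqrt{N}}{\gamma} \right) , \tag{13} $$

$$ 2 \int_0^{+\infty} AB\, d\omega = \frac{2 \cdot \gamma \cdot 2\gamma}{\pi \left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)} \int_0^{+\infty} \frac{d\omega}{(\gamma^2 + \omega^2) \left( \gamma^2 + \omega^2 + N - 2\omega\sqrt{N} \right)} = \frac{4\gamma}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right) \pi (4\gamma^2 + N)} \left[ \frac{\gamma}{\sqrt{N}} \log\frac{\gamma^2 + N}{\gamma^2} + \pi + \arctan\frac{\sqrt{N}}{\gamma} \right] . \tag{14} $$

Figure 12: (a) Behaviour of $f(10, \gamma)$ and solution of Eq. (6). (b) Behaviour of $f(N, \gamma)$.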



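Plugging Eqs. (12), (13) and (14) into Eq. (11) we obtain the following explicit form of the
condition $\epsilon_\gamma(\mathcal{E}_N,\mathcal{F}_N) = 1$:

$$ 1 = \frac{1}{\pi\gamma} + \frac{1}{2\gamma \left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right)^2} \left( \frac{\pi}{2} + \frac{\gamma\sqrt{N}}{\gamma^2 + N} + \arctan\frac{\sqrt{N}}{\gamma} \right) - \frac{4\gamma}{\left( \frac{\pi}{2} + \arctan\frac{\sqrt{N}}{\gamma} \right) \pi (4\gamma^2 + N)} \left[ \frac{\gamma}{\sqrt{N}} \log\frac{\gamma^2 + N}{\gamma^2} + \pi + \arctan\frac{\sqrt{N}}{\gamma} \right] . \tag{15} $$

Consider now the function $f(N, \gamma) = \epsilon_\gamma(\mathcal{E}_N,\mathcal{F}_N) - 1$:
for a fixed value of $N$, it is a monotonically decreasing function of $\gamma$, so Eq. (6)
has a unique solution $\bar{\gamma}$. In Fig. 12(a) we display the situation for $N = 10$. As
a function of $N$, the behaviour of $f(N, \gamma)$ is not monotonic, but its range is rather
narrow, as exemplified in Fig. 12(b).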
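Since the condition on the width is stated to be monotonically decreasing in the width for fixed N, its root can be located by simple bisection. The sketch below evaluates the closed-form squared distance between the empty and the full network (the sum of the A^2 and B^2 integrals minus twice the AB integral, with the AB integral written in the closed form obtained by partial fractions) and bisects on the interval [0.05, 2]. The bracket, the tolerance and the function names are assumptions of this example.

```python
import math


def eps_squared_empty_full(n, g):
    """Squared spectral distance between empty and full networks on n nodes."""
    r = math.sqrt(n)
    s = math.pi / 2 + math.atan(r / g)
    term_a = 1 / (math.pi * g)                                 # integral of A^2
    term_b = (s + g * r / (g * g + n)) / (2 * g * s * s)       # integral of B^2
    bracket = (g / r) * math.log((g * g + n) / (g * g)) + math.pi + math.atan(r / g)
    term_ab = 4 * g * bracket / (s * math.pi * (4 * g * g + n))  # 2 * integral of AB
    return term_a + term_b - term_ab


def solve_gamma(n, lo=0.05, hi=2.0, tol=1e-10):
    """Bisection for the width at which the empty/full distance equals 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eps_squared_empty_full(n, mid) > 1:
            lo = mid  # distance still above 1: the width is too small
        else:
            hi = mid
    return (lo + hi) / 2


gamma_5 = solve_gamma(5)
gamma_10 = solve_gamma(10)
```

For N = 5 and N = 10 the computed roots agree with the values 0.4272836 and 0.4517012 listed in Tab. 1.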



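B Uniqueness of $\bar{\gamma}^{\rightarrow}$

The spectra of the Laplacian matrices of the two extremal graphs $\widehat{\mathcal{E}}_N$ and
$\widehat{\mathcal{F}}_N$ (the bipartite representations of the empty and of the full directed
network) are now

$$ \mathrm{spec}(L(\widehat{\mathcal{E}}_N)) = (0, \dots, 0) \quad \text{and} \quad \mathrm{spec}(L(\widehat{\mathcal{F}}_N)) = (0, \underbrace{N-2, \dots, N-2}_{N-1}, \underbrace{N, \dots, N}_{N-1}, 2N-2) . $$

It follows that

$$ K_{\widehat{\mathcal{E}}_N} = \frac{2}{(2N-1)\pi} , \qquad K_{\widehat{\mathcal{F}}_N} = \frac{1}{(2N-1)\frac{\pi}{2} + (N-1) \left( \arctan\frac{\sqrt{N-2}}{\gamma} + \arctan\frac{\sqrt{N}}{\gamma} \right) + \arctan\frac{\sqrt{2N-2}}{\gamma}} . $$

Thus the equation (whose solution is the normalizing factor $\bar{\gamma}^{\rightarrow}$)
reads as follows:

$$ \epsilon_\gamma(\widehat{\mathcal{E}}_N,\widehat{\mathcal{F}}_N) = 1 . \tag{16} $$

Introduce now a few shorthands: define, for $T, U \in \mathbb{R}$, the following integral

$$ \int_0^{+\infty} \frac{d\omega}{\left( \gamma^2 + (\omega - \sqrt{T})^2 \right) \left( \gamma^2 + (\omega - \sqrt{U})^2 \right)} = \begin{cases} M(T) & \text{if } T = U , \\ L(T,U) & \text{if } T \ne U . \end{cases} $$

Then,

$$ M(T) = \frac{1}{2} \, \frac{\gamma^2 \arctan\frac{\sqrt{T}}{\gamma} + T \arctan\frac{\sqrt{T}}{\gamma} + \gamma\sqrt{T}}{\gamma^5 + T\gamma^3} + \frac{\pi}{4\gamma^3} $$

and

$$ L(T,U) = \frac{-\log(\gamma^2 + U) + \log(\gamma^2 + T)}{(4\gamma^2 + T + 3U)\sqrt{T} - (4\gamma^2 + 3T + U)\sqrt{U}} + \frac{\pi + \arctan\frac{\sqrt{T}}{\gamma} + \arctan\frac{\sqrt{U}}{\gamma}}{4\gamma^3 + T\gamma - 2\sqrt{T}\sqrt{U}\gamma + U\gamma} . $$

To shorten notations, define furthermore

$$ Z = \frac{2\gamma}{\pi} , \qquad W = \gamma (N-1) K_{\widehat{\mathcal{F}}_N} , \qquad W' = \frac{W}{N-1} . $$

With the aforementioned positions, Eq. (16) becomes

$$ \begin{aligned} 1 = {} & Z^2 M(0) + W^2 M(N-2) + W^2 M(N) + W'^2 M(2N-2) \\ & - 2ZW L(0, N-2) - 2ZW L(0, N) - 2ZW' L(0, 2N-2) \\ & + 2W^2 L(N-2, N) + 2WW' L(N-2, 2N-2) + 2WW' L(N, 2N-2) . \end{aligned} \tag{17} $$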
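The condition in Eq. (17) can also be solved numerically. The sketch below implements the integrals M(T) and L(T, U) in an algebraically equivalent form, assembles the squared distance between the two extremal bipartite graphs as a double sum over signed Lorentzian coefficients (which expands to the ten terms of Eq. (17)), and finds the root by bisection. Monotonicity in the width is assumed here by analogy with the undirected case; the bracket [0.05, 2] and all function names are hypothetical.

```python
import math


def pair_integral(t, u, g):
    """M(T) when t == u, L(T, U) otherwise, for width g."""
    a, b = math.sqrt(t), math.sqrt(u)
    if t == u:
        return (math.pi / 2 + g * a / (g * g + t) + math.atan(a / g)) / (2 * g ** 3)
    d = b - a
    s = d * d + 4 * g * g
    return (math.log((g * g + u) / (g * g + t)) / (d * s)
            + (math.pi + math.atan(a / g) + math.atan(b / g)) / (g * s))


def eps_squared_directed(n, g):
    """Squared distance between the bipartite empty and full graphs."""
    # (eigenvalue, multiplicity) of the full graph, without the single zero.
    groups = [(n - 2, n - 1), (n, n - 1), (2 * n - 2, 1)]
    k_full = 1 / sum(m * (math.pi / 2 + math.atan(math.sqrt(t) / g)) for t, m in groups)
    # Signed coefficients: Z for the empty graph, -W, -W, -W' for the full one.
    terms = [(0, 2 * g / math.pi)] + [(t, -m * g * k_full) for t, m in groups]
    return sum(ci * cj * pair_integral(ti, tj, g) for ti, ci in terms for tj, cj in terms)


def solve_gamma_directed(n, lo=0.05, hi=2.0, tol=1e-10):
    """Bisection for the width at which the squared distance equals 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eps_squared_directed(n, mid) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


gamma_dir_5 = solve_gamma_directed(5)
```

For N = 5 the computed root agrees with the directed value 0.3866861 reported in Tab. 1.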

As in the undirected case, for each $N$ Eq. (17) has a unique solution
$\bar{\gamma}^{\rightarrow}$, whose value is quite close to $\bar{\gamma}$, as shown in Tab. 1.

N        $\bar{\gamma}$    $\bar{\gamma}^{\rightarrow}$
5        0.4272836         0.3866861
10       0.4517012         0.4300291
50       0.4752742         0.4704579
100      0.4777976         0.4753463
500      0.4787492         0.4782538
1000     0.4785596         0.4783119
10000    0.4779060         0.4778813

Table 1: Comparison of $\bar{\gamma}$ and $\bar{\gamma}^{\rightarrow}$ for different numbers of nodes $N$.